Chinese Restaurant Process for cognate clustering: A threshold free approach

نویسنده

  • Taraka Rama
چکیده

In this paper, we introduce a threshold free approach, motivated from Chinese Restaurant Process, for the purpose of cognate clustering. We show that our approach yields similar results to a linguistically motivated cognate clustering system known as LexStat. Our Chinese Restaurant Process system is fast and does not require any threshold and can be applied to any language family of the world.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Tracklet clustering for robust multiple object tracking using distance dependent Chinese restaurant processes

To contrive an accurate and efficient strategy for object detection–object track assignment problem, we present a tracklet clustering approach using distance dependent Chinese restaurant processes (ddCRPs), which employ a two-level robust object tracker. The first level is an ordinary tracklet generator that obtains short yet reliable tracklets. In the second level, we cluster the tracklets ove...

متن کامل

A Similarity-based Chinese Restaurant Process for Social Event Detection

In this paper, we present our approach for the Social Event Detection task of Medieval 2013 [2]. The goal of the task was to group similar multimedia items into event clusters, based on their metadata (e.g. title, description, tags). Since the number of the event clusters in the test set was not known in advance, we formulated a non-parametric algorithm which resembles the Dirchlet process clus...

متن کامل

A Gibbs Sampler for Spatial Clustering with the Distance-dependent Chinese Restaurant Process

The distance-dependent Chinese Restaurant Process (dd-CRP) is a flexible class of distributions over partitions which was recently introduced by [1, 2]. In their description and experiments Blei and Frazier focus on the sequential setting such as clustering over time. Their Gibbs sampler, while general in nature, does not explicitly handle the case of non-sequential (also called spatial) cluste...

متن کامل

Temporally-Reweighted Chinese Restaurant Process Mixtures for Clustering, Imputing, and Forecasting Multivariate Time Series

This article proposes a Bayesian nonparametric method for forecasting, imputation, and clustering in sparsely observed, multivariate time series. The method is appropriate for jointly modeling hundreds of time series with widely varying, non-stationary dynamics. Given a collection of N time series, the Bayesian model first partitions them into independent clusters using a Chinese restaurant pro...

متن کامل

Reducing over-clustering via the powered Chinese restaurant process

Dirichlet process mixture (DPM) models tend to produce many small clusters regardless of whether they are needed to accurately characterize the data this is particularly true for large data sets. However, interpretability, parsimony, data storage and communication costs all are hampered by having overly many clusters. We propose a powered Chinese restaurant process to limit this kind of problem...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • CoRR

دوره abs/1610.06053  شماره 

صفحات  -

تاریخ انتشار 2016